Abstract. We present new techniques to perform adaptive spatial 
binning of two-dimensional (2D) data to reach a chosen constant signal- 
to-noise ratio per bin. These methods are required particularly for the 
proper analysis of Integral Field Spectrograph (IFS) observations, but 
can also be used for standard photometric imagery. Various schemes are 
tested and compared using data obtained with the panoramic IFS SAURON. 



1. Introduction 

Spatially resolved astronomical observations commonly span orders of magni- 
tude variations in the signal-to-noise (S/N) across the detector elements. For 
this reason data are often grouped together in the spatial direction (binned) and 
averaged before analyzing them. More spatial resolution is retained in the high- 
S/N regions compared to the low-S/N ones. A well known example is galaxy 
photometry, where logarithmically spaced radial bins are often adopted. Many 
more pixels are used to compute the value of a galaxy profile at large radii than 
in the center. 

Binning is essential in the case of spectroscopic observations of the stellar 
kinematics. In fact a minimum S/N is required for a reliable and unbiased 
extraction of kinematical information from the spectra (e.g., Rix & White 1992; 
van der Marel & Franx 1993; Kuijken & Merrifield 1993). For this reason binning 
is invariably used to analyze one-dimensional (1D, e.g., long-slit) spectroscopic 
observations. New developments with Integral Field Spectroscopy (IFS; e.g., 
OASIS on CFHT, SAURON on WHT, VIMOS on VLT, GMOS on Gemini) require 
methods to perform spatial binning of spectra in two dimensions (2D) too. 

The related process of adaptive smoothing is not acceptable for our purposes, 
since it correlates the information of different bins, which complicates the 
quantitative interpretation of spectroscopic measurements. For the same reason 
smoothing is never used for the quantitative analysis of 1D spectra. 

Little work has been done on the subject of adaptive 2D-binning. Sanders 
& Fabian (2001) developed an algorithm to be used with X-ray imaging data. 
The bins that their method produces however can contain other bins and are not 
compact. In the case of spectroscopic data it makes little sense to bin together 
spectra coming from pixels that are not close to each other and whose properties 
may differ considerably. Other schemes have to be developed. 

2. Formulation of the Problem 

We tackle here the problem of binning in the spatial direction(s). In what follows 
the term 'pixel' refers to a given spatial element of the dataset: it can be an 
actual pixel of a CCD image, or a spectrum position along the slit of a long-slit 
spectrograph or in the field of view of an IFS. 

Each pixel $i$ has an associated signal $S_i$ and its corresponding noise $N_i$. The 
pixel signal-to-noise ratio is therefore $(S/N)_i = S_i/N_i$. Our considerations do 
not depend on the details used to estimate these quantities, which we assume to 
be known beforehand for every pixel $i$. In the case of spectrography for instance, 
the quantity $S_i$ associated to a given spectrum $S_i(\lambda)$ can be the average signal 
over a given spectral range $\Delta\lambda$: $S_i = \frac{1}{\Delta\lambda}\int_{\Delta\lambda} S_i(\lambda)\,d\lambda$; while the corresponding 
average noise can be defined as $N_i^2 = \frac{1}{\Delta\lambda}\int_{\Delta\lambda} N_i^2(\lambda)\,d\lambda$. It is important to note 
that, in our case, the term 'binning' will only refer here to the averaging of 
observations taken at different positions on the sky (i.e., different pixels), and 
not along the spectral direction. 
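
As a minimal sketch of how the per-pixel quantities could be computed in practice, the following Python function averages a spectrum and its noise spectrum over the wavelength range. It assumes that the signal is the plain wavelength average and that the noise is averaged in quadrature; the function names, the array layout (one row per pixel) and the trapezoidal integration are illustrative choices, not part of the original text.

```python
import numpy as np

def _trapezoid(y, x):
    """Trapezoidal integral of y along its last axis, sampled at x."""
    dx = np.diff(x)
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * dx, axis=-1)

def pixel_signal_noise(spectra, noise_spectra, wavelength):
    """Average signal, average noise and S/N for every pixel.

    spectra, noise_spectra : arrays of shape (n_pixels, n_lambda)
    wavelength             : increasing array of shape (n_lambda,)
    Assumption: the noise is averaged in quadrature over the range.
    """
    delta = wavelength[-1] - wavelength[0]
    signal = _trapezoid(spectra, wavelength) / delta
    noise = np.sqrt(_trapezoid(noise_spectra ** 2, wavelength) / delta)
    return signal, noise, signal / noise
```

The returned arrays have one entry per pixel and can be passed directly to any of the spatial binning schemes discussed below.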

To bin in 1D one only has to make sure that the spatially binned data 
satisfy a minimum S/N requirement (or better a minimum scatter around the 
target S/N). In 2D (and higher dimensions) the situation is more complex and 
a good binning scheme has to satisfy the following requirements: 

Topological requirement: the bins should at least properly tessellate the region 
$\Omega$ of the plane under consideration, i.e., create a partition of $\Omega$, 
without overlapping or holes. While this requirement is trivial to enforce in 
1D, it is tricky to implement in higher dimensions, as the bin shapes must 
then be taken into consideration; 

Morphological requirement: the bin shapes have to be as 'compact' (or 'round') 
as possible, so that each bin is associated with a well-defined position and 
the binning retains the best possible spatial resolution; 

Uniformity requirement: the bin S/N should be as uniform as possible around 
a target value. While a minimum S/N is generally required, one does not 
want to sacrifice spatial resolution to increase the S/N even further. 
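
The Uniformity requirement can be checked numerically for any candidate binning. The sketch below is illustrative only: it assumes that the pixel noise is uncorrelated, so that signals add linearly and noises add in quadrature within a bin, and it represents a binning by a hypothetical integer array `bin_index` giving the bin number of every pixel (which also guarantees a partition without overlaps or holes).

```python
import numpy as np

def bin_sn(signal, noise, bin_index):
    """S/N of each bin, assuming uncorrelated pixel noise."""
    n_bins = bin_index.max() + 1
    total_signal = np.bincount(bin_index, weights=signal, minlength=n_bins)
    total_var = np.bincount(bin_index, weights=noise ** 2, minlength=n_bins)
    return total_signal / np.sqrt(total_var)

def fractional_scatter(sn, target_sn):
    """RMS fractional deviation of the bin S/N from the target value."""
    return np.sqrt(np.mean((sn / target_sn - 1.0) ** 2))
```

A binning scheme that satisfies the Uniformity requirement well gives a small value of `fractional_scatter`, while the Morphological requirement has to be judged separately from the bin shapes.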

In what follows we consider different methods and we apply each one to observations 
of the barred Sa galaxy NGC 2273 (Fig. 1) taken with the panoramic 
IFS SAURON (Bacon et al. 2001). These observations, based on 8 single 1800 s 
exposures, have been selected for having high S/N contrast between the inner 
and the outer parts, and a complex S/N-distribution, caused by the presence of 
spiral arms, S/N-jumps and irregular outer boundaries due to the merging of 
multiple exposures. 
Figure 1. SAURON observation of the barred Sa galaxy NGC 2273, 
based on two 4 x 1800 s pointings. The field-of-view is 49" x 44", with an 
effective spatial sampling of 0.8" x 0.8". Left panel: reconstructed total 
intensity. Right panel: S/N-map, the S/N being integrated along the 
whole SAURON spectral range. Red corresponds to an average S/N~50 
per resolution element. Note the galaxy spiral arms, as well as the two 
vertical S/N jumps close to the middle of the frame and the irregular 
boundaries due to the merging process of the different exposures. 



3. Quadtree Method 

It is useful to first consider the Quadtree algorithm (Samet 1984), which we 
believe is close to the best 'regular' image processing method available for the 
present application. We show that the Quadtree method cannot produce an 
optimal binning and more complex schemes are required. These we discuss in 
Section 4. 

The Quadtree method consists of a recursive partition of a region of the 
plane into axis-aligned squares. One square, the root, covers the entire region. 
A square is divided into four child squares, by splitting with horizontal and 
vertical segments through its center. The collection of squares then form a 
tree, with smaller squares at lower levels of the tree. The splitting of squares 
terminates when all squares meet some convergence criterion. 
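
The recursive splitting can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used for the figures: the convergence criterion adopted here (split a square only if all four children still reach the target S/N) is one plausible choice, the noise is assumed uncorrelated, and the image is treated as a full rectangular array without masked pixels. The names `region_sn` and `quadtree_bins` are hypothetical.

```python
import numpy as np

def region_sn(signal, noise, y0, y1, x0, x1):
    """S/N of the rectangle [y0:y1, x0:x1], assuming uncorrelated noise."""
    total_signal = signal[y0:y1, x0:x1].sum()
    total_noise = np.sqrt((noise[y0:y1, x0:x1] ** 2).sum())
    return total_signal / total_noise if total_noise > 0 else 0.0

def quadtree_bins(signal, noise, target_sn):
    """Return the leaf rectangles (y0, y1, x0, x1) of the Quadtree."""
    leaves = []

    def split(y0, y1, x0, x1):
        ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
        children = [(y0, ym, x0, xm), (y0, ym, xm, x1),
                    (ym, y1, x0, xm), (ym, y1, xm, x1)]
        # Assumed criterion: split only if every child keeps S/N >= target.
        can_split = (y1 - y0 > 1 and x1 - x0 > 1 and
                     all(region_sn(signal, noise, *c) >= target_sn
                         for c in children))
        if can_split:
            for child in children:
                split(*child)
        else:
            leaves.append((y0, y1, x0, x1))

    split(0, signal.shape[0], 0, signal.shape[1])
    return leaves
```

With this criterion every leaf inherits the S/N guarantee from the test made on its parent, but nothing prevents a leaf from exceeding the target by a large factor.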

In Fig. 2 the Quadtree method was used to rebin the S/N map of Fig. 1 
into squares satisfying a minimum S/N requirement. The nice feature of this 
binning method is that the resulting bins are squares of various sizes (except at 
the border). In this way bins are easy to handle, and require little information 
to be described completely. 

There are however two major problems with this method: 

• an S/N spread of a factor ~ 2 is unavoidable, because the bin area 
varies by construction in steps of a factor of 4; 

• unless the original image has a size which is a power of two, some bins at 
the border will not be square and generally will not meet the minimum 
S/N criterion, becoming unusable for later analysis. 
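
The factor ~ 2 spread quoted in the first item follows from a short scaling argument, sketched here under the assumptions of uncorrelated Poisson noise and a surface brightness that is roughly constant over a bin of area $A$:

```latex
\left(\frac{S}{N}\right)_{\rm bin}
  = \frac{\sum_i S_i}{\sqrt{\sum_i N_i^2}}
  \propto \frac{A}{\sqrt{A}} = \sqrt{A},
\qquad
\frac{(S/N)_{4A}}{(S/N)_{A}} = \sqrt{4} = 2 .
```

A square is therefore kept unsplit whenever its S/N lies anywhere between the target and about twice the target, since one more split would halve the S/N and drop it below the requirement.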
Figure 2. Left panel: Quadtree binning of NGC 2273 for a target 
S/N=50 (red line). Right panel: S/N variation in the Quadtree binning 
of NGC 2273. The open squares represent the S/N of each bin, as a 
function of its distance from the galaxy center. Note the factor 2x 
scatter in the S/N due to the factor 4x recursive splitting of the bins. 
The target S/N=50 is indicated by the red thick horizontal line. 

4. Voronoi Tessellation 

Given the inability of methods employing 'regular' bins to produce optimal 2D 
tessellations, we now consider schemes which do not have square or rectangular 
bins. Accordingly we consider the Voronoi Tessellation (VT), that can be used 
to generate binnings satisfying all the three requirements of Section 2. 

Given a region $\Omega$ and a set of points $\{z_i\}_{i=1}^{N}$, called generators, in $\Omega$, a VT is 
a partition of $\Omega$ into regions $\{V_i\}_{i=1}^{N}$ enclosing the points closer to $z_i$ than to any 
other generator. Each $V_i$ is referred to as the Voronoi region or bin associated 
to $z_i$ (see Okabe, Boots, & Sugihara 1992, for a comprehensive treatment). 

The VT presents many interesting features for the binning problem: 

• it naturally enforces the Topological requirement; 

• it is efficiently described by the coordinates of its generators; 

• it is very easy to implement in the discrete case: given the generator 
positions, it is sufficient to locate the closest generator to any given pixel 
to determine the bin to which it belongs. 
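
The discrete construction mentioned in the last item amounts to a nearest-generator search. A minimal brute-force sketch follows; the function name and the (n, 2) coordinate layout are illustrative assumptions, and for large images a spatial search tree would be a more efficient substitute for the full distance matrix.

```python
import numpy as np

def voronoi_assign(pixel_xy, generator_xy):
    """Index of the closest generator for every pixel.

    pixel_xy     : array of shape (n_pixels, 2)
    generator_xy : array of shape (n_generators, 2)
    """
    diff = pixel_xy[:, None, :] - generator_xy[None, :, :]
    dist2 = (diff ** 2).sum(axis=2)   # squared distances, (n_pixels, n_generators)
    return dist2.argmin(axis=1)
```

Because every pixel receives exactly one label, the result is automatically a partition of the pixel set, which is how the Topological requirement is enforced.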

On the other hand, the fact that a VT is adopted for binning does not en- 
force by itself the Morphological requirement: the bins are convex by construc- 
tion, but can have very sharp angles. Furthermore, the Uniformity requirement 
is not addressed in any way by the use of a VT. These two requirements have 
to be tackled through a properly tailored distribution of the Voronoi generators. 
We present now a way to produce such a distribution. 
Figure 3. Left panel: CVT-binning of the density $\rho^2(\mathbf{r})$, where $\rho$ is 
obtained by linear interpolation from the surface brightness in Fig. 1. 
This interpolation is used to 'simulate' an observation with very high 
spatial resolution. The VT generators are shown with the crosses. 
Right panel: S/N scatter in the CVT-binning of NGC 2273. The origi- 
nal S/N of the pixels is shown with the crosses, while the open squares 
represent the S/N of the final bins. The target S/N=50 is indicated by 
the red horizontal line. 



4.1. Centroidal Voronoi Tessellation 

The Centroidal Voronoi Tessellation (CVT) is a technique which can be used to 
generate an optimally uniform and regular VT in the continuous case, or in the 
limit of a large number of pixels. We initially assume that the observed signal 
$S_i$ can be described by a continuous function $\rho(\mathbf{r})$ in the sky-plane. Moreover 
we assume the noise $N_i$ to be a monotonic function of $S_i$ (e.g., Poissonian noise 
$N_i = \sqrt{S_i}$). With these assumptions the problem of generating equal-S/N bins 
reduces to that of producing a tessellation enclosing equal mass, according to 
the density distribution $\rho(\mathbf{r})$. 

Given a density function $\rho(\mathbf{r})$ defined in a region $\Omega$, a CVT of $\Omega$ is a special 
class of VT where the generators $z_i$ happen to coincide with the mass centroids 
$z_i^* = \int_{V_i} \mathbf{r}\,\rho(\mathbf{r})\,d\mathbf{r} / \int_{V_i} \rho(\mathbf{r})\,d\mathbf{r}$ of the corresponding Voronoi regions $V_i$. As illustrated 
in the review by Du, Faber, & Gunzburger (1999), CVTs are useful 
to solve a variety of mathematical problems, but can also be observed in many 
real-life natural examples (living cells, territories of animals, etc.). 

One of the most striking characteristics of CVT in the 2D case is its ability 
to partition a region into bins whose size varies as a function of the underlying 
density distribution, but whose shape tends asymptotically to a hexagonal-like 
lattice for a large number of bins. Another nice feature of CVT is that a simple 
algorithm exists for its practical computation: the Lloyd (1982) method, for 
which the CVT is a fixed point. 
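
The Lloyd iteration alternates two steps: build the VT of the current generators, then move each generator to the mass centroid of its region. The sketch below applies it to a discrete set of pixels with given density weights. It is a minimal illustration rather than the code used for the figures; the function name, the fixed iteration cap and the handling of empty regions (their generator is left in place) are assumptions.

```python
import numpy as np

def lloyd_cvt(pixel_xy, density, generators, n_iter=50):
    """Lloyd iteration on a discrete pixel set weighted by `density`."""
    gen = np.array(generators, dtype=float)

    def assign(g):
        diff = pixel_xy[:, None, :] - g[None, :, :]
        return (diff ** 2).sum(axis=2).argmin(axis=1)

    for _ in range(n_iter):
        label = assign(gen)
        mass = np.bincount(label, weights=density, minlength=len(gen))
        new_gen = gen.copy()
        ok = mass > 0          # empty regions keep their generator
        for axis in range(2):
            moment = np.bincount(label, weights=density * pixel_xy[:, axis],
                                 minlength=len(gen))
            new_gen[ok, axis] = moment[ok] / mass[ok]
        if np.allclose(new_gen, gen):
            break              # fixed point reached: generators = centroids
        gen = new_gen
    return gen, assign(gen)
```

At convergence each generator coincides with the weighted centroid of its own region, which is the defining property of a CVT.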

Figure 4. Left panel: VT of the SAURON pixel grid of Fig. 1, obtained 
using the same generators as in Fig. 3. Right panel: S/N scatter in the 
bins as a function of the distance of the bin centroid from the galaxy 
center. Note the significant increase of the scatter, compared to Fig. 3. 

Although CVT bins are naturally smaller where the density is higher, the 
area-$\rho$ relation of the bins is not such that the mass enclosed in every bin is 
the same: the CVT cannot be used directly to produce equal mass bins (equal 
S/N in the case of photon noise). However, it can be shown that if a CVT is 
constructed for the density $\rho^2$, the tessellation obtained will enclose equal mass 
according to the original density $\rho$ (see Cappellari & Copin in prep, for details). 
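
A heuristic way to see why the squared density is the right choice is sketched below. It is not the derivation deferred to the cited paper; it relies on the known asymptotic behaviour of a 2D CVT with many generators, in which the number density of generators scales as the square root of the density function, so that the bin area $A$ scales as the inverse square root.

```latex
\text{CVT of } \rho:\quad A \propto \rho^{-1/2}
  \;\Rightarrow\; M_{\rm bin} \simeq \rho\,A \propto \rho^{1/2}
  \quad (\text{not constant}),
\\[4pt]
\text{CVT of } \rho^{2}:\quad A \propto (\rho^{2})^{-1/2} = \rho^{-1}
  \;\Rightarrow\; M_{\rm bin} \simeq \rho\,A = \text{const}.
```

The argument holds only when the density varies slowly across a single bin, which is the same large-number-of-pixels limit assumed throughout this section.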

Fig. 3 presents a CVT produced by applying Lloyd's algorithm to the 
density $\rho^2$, where $\rho$ was obtained by linearly interpolating the surface brightness 
of Fig. 1 onto a grid with pixel size 8x smaller than the original size. This 
interpolation is used here to 'simulate' an observation with a much higher spatial 
resolution, but is not an accurate way of dealing with lower resolution data. In 
this case of a large number of pixels the cells of the VT tend to the theoretical 
hexagonal shape and adapt nicely to density variations and to the irregular 
boundary of the region. The scatter of the S/N is also close to optimal with 
RMS scatter of ~ 4%. 

This CVT method illustrates the goals towards which an optimal 2D- 
binning algorithm should tend, but it has still some practical limitations: 

• it generates equal-mass bins and not necessarily equal-S/N bins, unless the 
noise is a monotonic function of the signal. It is, e.g., useful to produce 
equal-S/N bins when all the noise in the pixel can be attributed to photon 
noise. This is often the case, but not always; 

• more importantly the method does not work well when the bins consist 
of just a few pixels. In Fig. 4 the same generators as in Fig. 3 
were used to construct a VT for the coarser SAURON pixel grid of Fig. 1. 
The obtained VT is similar to that of Fig. 3, but the S/N scatter increases 
considerably, due to discretization effects. 

In practice an optimal 2D-binning method has to preserve the good char- 
acteristics of the CVT, in the limit of many pixels, but has to be able to take 
the discrete nature of pixels into account, when dealing with bins constituted 
by just a few pixels. This algorithm is the subject of the next section. 
Figure 5. Left panel: Final result for the bins after application of the 
bin-accretion algorithm to the SAURON S/N map of NGC 2273. Right 
panel: the bin S/N as a function of the distance from the galaxy 
center. Note the decrease of the S/N scatter compared to Fig. 4. 



4.2. Bin-Accretion Algorithm 

We describe a method we developed to find the generators for the optimal VT 
of an image or IFS observation taking the discrete nature of pixels into account 
from the beginning. The algorithm described here constructs an initial binning 
trying to generalize to 2D the standard pixel-by-pixel binning algorithm used 
in 1D. The centroids of the bins found in this way are then used as starting 
generators for a CVT. The method reduces to the previous CVT in the limit of 
many pixels, but works on a pixel basis with bins made by just a few pixels. 

A natural 1D-binning algorithm starts a bin from the highest S/N unbinned 
pixel and accretes pixels until a given target S/N is reached. To extend this idea 
to 2D, we need to make a choice for the direction toward which new pixels are 
accreted to a 2D-bin. The adopted method always tries to add to the current 
bin the pixel that is closest to the bin centroid. Furthermore, when a new bin is 
started, the first pixel is also selected as the one closest to the centroid of all the 
previously binned pixels. This simple scheme automatically tends to generate 
bins that are compact, and the bin S/N can be carefully monitored on a pixel- 
by-pixel basis during the accretion phase. Some small imperfections remain at 
the end of the accretion phase: these are corrected by performing a CVT, using 
as starting generators the centroids of the previous bins. 
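
The accretion phase described above can be sketched as follows. This is a simplified illustration under several assumptions that go beyond the text: the first bin is seeded at the highest-S/N pixel, centroids are unweighted means of the pixel coordinates, the noise is uncorrelated, and no adjacency or roundness checks are applied, so a bin may pick up a detached pixel when all its neighbours are already binned. The last bin can also end below the target S/N. The function name and the returned values are hypothetical.

```python
import numpy as np

def bin_accretion(xy, signal, noise, target_sn):
    """Accrete pixels into bins until each reaches the target S/N."""
    xy = np.asarray(xy, dtype=float)
    label = np.full(len(signal), -1)          # -1 marks unbinned pixels
    seed = int(np.argmax(signal / noise))     # assumed start: best pixel
    n_bins = 0
    while True:
        members = [seed]
        label[seed] = n_bins
        while True:
            sn = signal[members].sum() / np.sqrt((noise[members] ** 2).sum())
            free = np.flatnonzero(label < 0)
            if sn >= target_sn or free.size == 0:
                break
            # Add the unbinned pixel closest to the current bin centroid.
            centroid = xy[members].mean(axis=0)
            nearest = free[((xy[free] - centroid) ** 2).sum(axis=1).argmin()]
            members.append(int(nearest))
            label[nearest] = n_bins
        n_bins += 1
        free = np.flatnonzero(label < 0)
        if free.size == 0:
            break
        # New bin: start from the pixel closest to all binned pixels' centroid.
        binned_centroid = xy[label >= 0].mean(axis=0)
        seed = int(free[((xy[free] - binned_centroid) ** 2).sum(axis=1).argmin()])
    centroids = np.array([xy[label == k].mean(axis=0) for k in range(n_bins)])
    return label, centroids
```

The returned `centroids` play the role of the starting generators for the final CVT step, which regularizes the bins left by the accretion phase.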

An example of the application of the bin-accretion algorithm to the binning 
of the actual SAURON data of NGC 2273 is shown in the left panel of Fig. 5, 
while the resulting S/N scatter is shown in the right panel: the RMS value is 
~ 6%. The S/N values, symmetrically clustered around the target S/N = 50, 
essentially represent the lowest S/N scatter obtainable from a binning of these 
data: all the scatter is due to discretization noise, which increases towards 
the galaxy center, where the bins are made of a smaller number of pixels. The 
velocity field extracted from the binned data is shown in Fig. 6. 

Figure 6. The stellar mean velocity field measured from the unbinned 
SAURON data of NGC 2273 (left panel), is compared to the interpolated 
velocities extracted from the 2D-binned spectra (right panel). The 
centroids of the bins used (Fig. 5) are shown with the black dots. Some 
representative contours of the galaxy surface brightness are also shown. 

The binning resulting from this algorithm is qualitatively similar to the one 
obtained using the CVT (see Fig. 3). But, in contrast to the CVT alone, this 
method is able to produce bins that are essentially optimal also in the 'small-bins' 
regime, with bins of only 2-4 pixels. 

5. Conclusions 

We presented a method to adaptively perform spatial binning of 2D data (e.g., 
IFS or imaging data). As an example of its application we have extracted the 
stellar mean velocity field from the 2D-binned SAURON data of the barred spiral 
galaxy NGC 2273. Adaptive 2D-binning should become common practice for 
the analysis of 2D data (in particular spectral data), as it is for 1D observations. 